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Abstract 

We introduce state-space models where the functionals of the observational and the evolu- 
tionary equations are unknown, and treated as random functions evolving with time. Thus, our 
model is nonparametric and generalizes the traditional parametric state-space models. This 
random function approach also frees us from the restrictive assumption that the functional 
forms, although time-dependent, are of fixed forms. The traditional approach of assuming 
known, parametric functional forms is questionable, particularly in state-space models, since 
the validation of the assumptions require data on both the observed time series and the latent 
states; however, data on the latter are not available in state-space models. 

We specify Gaussian processes as priors of the random functions and exploit the "look- 
up table approach" of Bhattacharya (2007) to efficiently handle the dynamic structure of the 
model. We consider both univariate and multivariate situations, using the Markov chain Monte 
Carlo (MCMC) approach for studying the posterior distributions of interest. In the case of 
challenging multivariate situations we demonstrate that the newly developed Transformation- 
based MCMC (TMCMC) of Dutta & Bhattacharya (2011) provides interesting and efficient 
alternatives to the usual proposal distributions. We illustrate our methods with a challenging 
multivariate simulated data set, where the true observational and the evolutionary equations are 
highly non-linear, and treated as unkown. The results we obtain are quite encouraging. More- 
over, using our Gaussian process approach we analysed a real data set, which has also been 
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analysed by Shumway & Stoffer (1982) and Carlin, Poison & Staffer (1992) using the linear- 
ity assumption. Interestingly, the results of analyses of this data using our methods, although 
agree with those obtained by the previous authors for a major part of the time series, diasagree 
towards the end of the time series. Since our methods very flexibly encapsulate both linear 
and nonUnear structures unUke the Unear approaches of the previous authors, the disagreement 
between the results show that towards the end of the time series, the linearity assumption of 
the previous authors breaks down. 

Keywords: Evolutionary equation; Gaussian process; Look-up table; Observational equa- 
tion; State-space model; Transformation-based Markov chain Monte Carlo. 

1. INTRODUCTION 

The state-space models play important role in dealing with dynamic systems that arise in various 
disciplines such as finance, engineering, ecology, medicine, and statistics. The time-varying re- 
gression structure and the flexibility inherent in the sequential nature of state-space models make 
them very suitable for analysis and prediction of dynamic data. Indeed, as is well-known, most 
time series models of interest are expressible as state-space models; see Durbin & Koopman (2001) 
and Shumway & Stoffer (201 1) for details. However, till date, the state-space models have consid- 
ered only known forms of the equations, typically linear. But testing the parametric assumptions 
require data on both the observed time series and the unobserved states; unfortunately, data on the 
latter are not available in state-space models. Moreover, the regression structures of the state-space 
models may evolve with time, changing from linear to non-linear, and even the non-linear struc- 
ture may also evolve with time, yielding further different non-linear structures. To our knowledge, 
there does not exist any approach in the literature that can handle such evolving, yet unknown, 
functional forms. These arguments point towards the need for developing general, nonparametric, 
approaches to state-space models, and this indeed, is our aim in this article. We adopt the Bayesian 
paradigm for its inherent flexibility. 

In a nutshell, in this work, adopting a nonparametric Bayesian framework, we treat the regres- 
sion structures as unknown and model these as Gaussian processes, and develop the consequent 
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theory in the Bayesian framework, considering both univariate and multivariate situations. The 
Gaussian process-based approach we develop here allows very flexible modeling of the unknown 
structures dynamically evolving with time, and gives particular importance to nonparametric dy- 
namic evolutions of the state-space models, providing a sound and very general theory for the 
latter. We also develop efficient MCMC-based methods for simulating from the resulting posterior 
distributions. 

Application of our methods to a challenging simulated multivariate data set with highly non- 
linear true observational and evolutionary equations (treated as unknown) yielded encouraging 
results. Aplication of our ideas to a real data set which has been analysed by Shumway & Stoffer 
(1982) and Carlin et al. (1992) assuming linearity, provided an interesting insight that, although 
the linearity assumption may not be unreasonable for most part of the time series, linearity does 
not hold towards the end of the time series, which the previous authors failed to take account of. 
Indeed, the results we obtained here using our nonparametric dynamic state-space model are in 
agreement with the results obtained by the previous authors for a major part of the time series, 
but towards the end of the time series, our results disagree with those obtained using the linear 
assumptions. Since our nonparametric model includes both linear and nonlinear structures, the 
discrepancy between the results suggests that the linearity assumptions are no longer valid here 
after a certain point of time. These studies also vindicate that our approach is indeed capable 
of modeling unknown functions even if the forms are changing with time, without requiring any 
change point analysis and specification of functional forms before and after change points. 

Before introducing our approach, we provide a brief overview of state-space models. 



2. 



OVERVffiW OF STATE-SPACE MODELS 



Generally, state-space models are of the following form: for i = 1, 2, . . ., 



(1) 



xt = gt{xt-i) + Vt 



(2) 
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In the above, ft and gt are assumed to be functions of known forms which may or may not ex- 
plicitly depend upon t; r]t, et are usually assumed to be zero mean iid normal variates. The choice 
ftixt) = FtXt and gt{xt-i) = GtXt-i, assuming known Ft,Gt, have found very wide use in the lit- 
erature. Obviously, Xt, yt may be univariate or multivariate. Matrix- variate dynamic linear models 
have been considered by Quintana & West (1987) and West & Harrison (1997) (see also Carvalho 
& West (2007)). Equation ([T]) is called the observational equation, while ([2]) is known as the evo- 
lutionary equation. Letting Dt = (z/i, • • • ,1/r)' denote the available data, the goal is to obtain 
inferences about i/t+i (single-step forecast), yx+k (k-step forecast), xt+i conditional on yx+i (fil- 
tering), xx-k (retrospection). In the Bayesian paradigm, the interests center upon analyzing the 
corresponding posteriors [yr+i \ Dt], [yr+k I Dt], [xt+i \ DT,yT+i] (also, [0:^+1 I Dt]) and 
[xT-k I Dt]. 

In the non-Bayesian framework, solutions to dynamic systems are quite generally available via 
the well-known Kalman filter. However, the performance of Kalman filter is heavily dependent 
on the assumption of Gaussian errors and linearity of the functions in the observation and the 
evolution equations. In the case of non-linear dynamic models, various linearization techniques 
are used to obtain approximate solutions. For details on these issues, see Brockwell & Davis 
(1987), Meinhold & Singpurwala (1989), West & Harrison (1997) and the references therein. The 
Bayesian paradigm frees the investigator from restrictions of linear functions or Gaussian errors, 
and allows for very general dynamic model building through coherent combination of prior and the 
available time series data, and using Markov chain Monte Carlo (MCMC) for inference. Bayesian 
non-linear dynamic models with non-Gaussian errors, in conjunction with the Gibbs sampling 
approach for inference, have been considered in Carlin et al. (1992); see Durbin & Koopman 
(2001) and Shumway & Stoffer (201 1) for comprehensive treatments of this area. 

We are not aware of any nonparametric approach to modeling the functions ft and gt in state- 
space models. One can anticipate spline/wavelet-based approaches but such approaches are not 
capable of modeling the unknown functions when the functional forms change with time. 

In the next section we introduce our novel nonparametric dynamic model where we use Gaus- 
sian processes to model the unknown functions ft and gt- Assuming the functions to be random 
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allows us to accommodate even those functions the forms of which are changing with time. We 
begin with one-dimensional Xt and yt, gradually generalizing to the multivariate setup in Section 
|5] In Section [6] we illustrate our models and methodologies using a multivariate simulated data, 
where the true multivariate observational and the evolutionary equations are highly nonlinear, apart 
from being multidimensional. In Section |7] we consider application to a real, univariate data set. 
We present a summary of the current work, along with discussion of further work in Section [8] 
Additional derivations and further details are provided in the supplement Ghosh, Mukhopadhyay, 
Roy & Bhattacharya (201 lb), whose sections have the prefix "S-" when referred to in this paper. 

3. NONPARAMETRIC DYNAMIC MODEL: UNIVARIATE CASE 
For any input x we represent ft{x) and gtix) as /(t, x) and g{t, x), respectively. In other words, 
we consider time t as input as well. We denote (t, x)' by x*. Crucially, we allow /(■, ■) and g{-, ■) 
to be of unknown functional forms, which we model as Gaussian processes. In the univariate 
situation (the multivariate case will be considered subsequently) our complete dynamic model has 
the following form 

yt = f{xlt) + et, Q~iV(0,a2), (3) 
= + ^t, r/t ~ Ar(0,a2), (4) 

where xl^ = (t^Xt)', and xlt_^ = (t,a;j_i)'. We assume that xq ~ A^(/i^o, cr^J; /i^o,^^^ being 
known. The functions / and g are modeled as independent Gaussian processes with mean functions 
= h{-)' I3f and iJig{-) = h{-)' I3g with h{x*) = (1, x*)' for any x*, and covariance functions 
of the form cr|c/(-, ■) and o-gCg{-, ■), respectively. The process variances are aj and cr^ and c/, Cg 
are the correlation functions. Typically, for any Zi,Z2, 0/(21,22) = exp{ — (21 — Z2)'Rf{zi — 
22)} and Cg{zi, 22) = exp{ — (21 — 22)'-Rg(2i — 22)}, where Rf and Rg are 2 x 2-dimensional 
diagonal matrices consisting of respective smoothness (or, roughness) parameters {ri j,r2j} and 
{ri r2^g}, which are responsible for the smoothness of the process realizations. These choices 
of the correlation functions imply that the functions, modeled by the process realizations, are 
infinitely smooth. The sets of parameters Of = {f3j, aj, Rf) and 6g = {f3g, cr|, Rg) are assumed 
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to be independent a priori. We consider the following form of prior distribution of the parameters: 
[l3f,a},Rf,l3g,al,rg,al,al] = \f3 ^,a},Rf][l3g,al,Rg\[al,al]. 

Note that, the distribution of Dt, conditional on . . . , x^ j, (equivalently, conditional on 
xi, . . . , xt), and the other parameters is multivariate normal: 

Dt ~ NT(HDrf3f, a} A f^Dr + (^IIt) , (5) 

where H']^_^ — [h{xl^i), . . . , h{xT^T)] , Af^^r is a T x T matrix with (i, j)-th element c/(x*j, x* j); 
= 1, . . . , T, and It is the T-th order identity matrix. 
The joint distribution of (xi, . . . , xt), however, is much less straightforward. Observe that, al- 
though we have [xo] ~ N{ijL^g,alJ, [xi I xo] ~ N{h{xoy (3g, a'^+a^),hut[x2 \ xi,xo\^[g{2,xi) + 
r)2 I xi, xo] = [g{2, g{l, xq) + 771) + 772 | g{l, xq) + rji, xo], the rightmost expression suggesting that 
special techniques may be necessary to get hold of the conditional distribution. We adopt the pro- 
cedure introduced by Bhattacharya (2007) to deal with this problem. The idea is to conceptually 
simulate the entire function g modeled by the Gaussian process, and use the simulated process as 
a look-up table to obtain the conditional distributions of {xi;i > 2}. 

3.1 Look-up table idea for facilitating derivation of conditionals and ensuring computational ef- 
ficiency 

For the purpose of illustration only let us assume that = for all t, yielding the model Xt — 
g{x* ^_^). The concept of look-up table in this problem can be briefly explained as follows. Given 
that the entire process g{-) is available (this means that for every input z, the corresponding g{z) 
is available), conditional on x^ f_i, xt — g{xlt-i) can be obtained by simply picking input xlf._-^ 
and reporting the corresponding output value g{x*^^_-^). This hypothetical simulation procedure 
suggests that conditional on the simulated process, it can be safely assumed that xt depends only 
upon 

Note that given xq we can simulate xi — g{x\ f^ ~ N{h{x\ Q)' I3g, a"^), which is the marginal 
distribution of the Gaussian process prior. Thus, Xi is simulated without resorting to any approxi- 
mation. It then remains to simulate the rest of the dynamic sequence, for which we need to simulate 
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the rest of the process {g{x*); x* ^ a:* q}- In practice, it is not possible to have a simulation of 
this entire set. We only have available a set of grid points = {^i, . . . , Zn} obtained, perhaps, 
by Latin hypercube sampling (see, for example, Santner, Williams & Notz (2003)) and a corre- 
sponding simulation of g, given by — {g{zi), . . . , g{zn)}, the latter having a joint multivariate 
normal distribution with mean 

E[D,\dg]^HD^^g (6) 

and covariance matrix 

V[D,\d,] = a'^A,,n., (7) 

where H'jj^=[h{zi), . . . , h{zn)] and Ag^^^ is a correlation matrix with the (i,j)-th element Cp(^i, zj). 
and = {cg{-,zi),. . . ,Cg{-,Zn))'. 

Given (xq, gixl^o)), we simulate from [D^ \ dg, g{x\ Q),XQ]. Since the joint distribution of 

[£>2, ^((xi q) I Xo] is multivariate normal with mean vector I " ^ I and covariance matrix 



h{xoyf3 



a 



where 



it follows that the conditional [D^ \ 5'(2^i,o)) ^i,o] ^ n-variate normal distribution with mean 
vector 

E[D, I g(xlo),xo, Og] = iJ,g,D, = HD,f3g + Sg,DM,o)(9Ko) - H^lfiYf^g) (9) 
and covariance matrix 

V[D,\g{xl,),xo,dg]^a'^i:g,D^ (10) 

where 

= Ag^D, - Sg^D, {xlo)Sg,D, {xI q)' (1 1) 

The probability that Xf ^_i will coincide with some grid point in G^, is zero. But the Gaussian 
process assumption provides a nice way of dealing with the problem. Assuming conditional in- 
dependence as discussed above, the conditional distribution [g{xl^_i) \ D^, Xt-i\ is normal with 



mean 

fxt = h{xl,_,)'(3^ + s,,D. - Hn.(3^) (12) 

and variance 

= a] {1 - s,,nA^l,_,yA-Xsg,DA^lt-i)] (B) 

Note that, thanks to the Gaussian process assumption, conditioning on forces the random 
function g{-) to pass through the points in (G^, Dz) since the conditional [g{x) \ 6g, Dz,x] has 
zero variance if x e (see, for example, Bhattacharya (2007) and the references therein). In 
other words, if xl^_-^ e Gz, then cr^ = so that 

[gi^lt-l) I Xt-l] = (14) 

where 5z denotes point mass at z. This property of the conditional associated with Gaussian pro- 
cess is in keeping with the insight gained from the discussion related to look-up table associ- 
ated with prediction of the outputs of deterministic function having dynamic behaviour. However, 
xl^_i ^ Gz with probability 1 and the conditional [g{x^^_-^) \ 0g, Dz, a^t-i] provides spatial inter- 
polation within [Gz, D^) (see, for example, Cressie (1993), Stein (1999)). Finer the set Gz, closer 
is [g{xlf._-^) I dg,Dz,Xt-]\ to 5gi^x*^_^)^ The conditional independence assumption of g{xl^_-^ of 
all {x*^ ^] i < {t — 1)} given {Gz, Dz) is in accordance with the motivation provided by the deter- 
ministic sequence and here Dz acts as a set of auxiliary variables, greatly simplifying computation, 
while not compromising on accuracy. One subtlety involved in the assumption of conditional in- 
dependence is that, conditional on x* q, G^ must not contain x* otherwise Dz would contain 
5'(a:* q) implying that [g{x1^^) \ dg, Dz, Xt] is dependent on xi = g{xl Q), violating the conditional 
independence assumption. 

To summarize the ideas, let Xq ~ tt, where tt is some appropriate prior distribution of Xq. The 
entire dynamic sequence {xo,xi = g{xl Q),X2 = 5'(a^2,i)> • • •} ^^^^ simulated using the 
following steps sequentially: 

(1) Draw xo ~ tt. 

(2) Given xq, draw xi = g(xlo) ~ N(h(xloy/3g, (^g)- 
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(3) Given xq, and xi = g{xl o), draw ~ [D^ \ Og, g{xl Q),xo]. 

(4) For t = 2, 3, . . ., draw Xt ~ [xt = g{xl^_^) \ 6g, D^, xt-i]. 

Step (1) is a simulation of xq from its prior, step (2) is simply drawn from the known marginal 
distribution of g(xl q) given xq. In step (3) is drawn conditional on g{xl g) (and xl q), con- 
ceptually implying that the rest of the process {g{x*); x ^ x^q} is drawn once g(xl q) is known. 
Step (4) then uses this simulated to obtain the rest of the dynamic sequence, using the assumed 
conditional independence structure. 

It is shown in Bhattacharya (2007), marginalization over is possible but the resulting com- 
putations become severely unstable numerically. In our state space modeling scenario, although we 
use the concept of look-up table illustrated above, the posterior distributions are much more com- 
plex and analytic marginalization over is not possible, since the posterior of is intractable. 
In any case, numerical instability is avoided in our situation. 

3.2 Look-up table-induced joint distribution of {xq, Xi, . . . , xt+i, Dz} 

Once Gz and are available, we write down the joint distribution of (D^, xq, xi, . . . , xt) con- 
ditional on the other parameters as 

= [xo][xi = gixlo) +T]i I xo,a'^][D, \ Og] 

T 

W[xt+i = g{x*t_^_i^t) + Vt+i I D„ Xt, Og, (15) 
t=i 

Recall that [xq] ~ N{fi^^,alJ, [xi = ^(x* o) + r/i | x* o,or2]~ iV(h(x* o)'/3g, + a^) and 
the distribution of is multivariate normal with mean and variance given by (|6]) and (|7]). The 
conditional distribution [xt+i = J + rjt+i \ D^-, Xt-, Og, cr.^] is normal with mean 

/i,, = ^(a;*+i,,)'/3, + Sg^D. (x.Vi^J'A-],^ - HD^^g) (16) 

and variance 

(^l, = + - Sg,DA<+i,tyA~g!D.Sg,DA<+i,t)] (17) 
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Observe that in this case even if xl_^_i ^ E Gz, due to the presence of the additive error term r]t+i, the 
conditional variance of xt+i is non-zero, equalling a^^ = a^, the error variance. Conditional on all 
the unknowns, the distribution of Dt is given by (|5]). In the next section we complete specification 
of our fully Bayesian model by choosing appropriate prior distributions of the parameters. 

3.3 Prior specifications 

We assume the following prior distributions: 

[log(r,,^)] ~ N (/i.,,,, ; for z = 1, 2. (18) 
[log(r,„,)] ~ N < J ; for 2 = 1, 2. (19) 

We] « {^!y^^^ exp |-^| ; "o7e > (20) 



[aj (X exp <j !> ; a„7, > (21) 



2a. 



V 



[aj] cx (aj)-^ )exp\-^^\;af,jf>0 (22) 



KV K)"^ ^exp|-^|;a„7,>0 (23) 

[/3;]~iV„(/3^_o,S;3,,o) (24) 
[/3J ~ iV„ (/3^,o, (25) 

All the prior parameters are assumed to be known. We discuss the choices of the prior parameters 
in the contexts of the specific applications. 

3.4 MCMC -based Bayesian inference 

In this section we begin with the problem of forecasting yr+i, given the data set Dt- Interestingly, 
our approach to this problem provides an MCMC methodology which generates inference about 
all the posterior distributions required, either as by-products or by simple generalization of this 
MCMC approach using an augmentation scheme. Details follow. 
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The posterior predictive distribution of Dt+i given Dt is 

[yT+i\DT\ = j[yT+i\ DT,xo,xi,...,XT+i,f3f,Rf,a'^^] 

x[xo,Xi, . . . ,XT+i,f3f,(3g,Rf,Rg,a^cr^ \ DT]d0fdegdcr^,a^dxodxi, . . .,dxt+i. 

(26) 

This posterior is not available analytically, and so simulation methods are necessary to make infer- 
ences. In particular, once a sample is available from the posterior [xq, xi, . . . , xt+i, /3j, f3g, Rf, Rg, (J^a'^ 
Dt], the corresponding samples drawn from [yr+i \ Dt-, xq, xi, . . . , xt+i, /3/, -R/, 0"^] are from 
the posterior predictive ( [26] ), using which required posterior summaries can be obtained. Note that 
the conditional distribution [yr+i = f{xT+i;T+i) + ^t+i \ Dt, xq, xi, . . . , a;r+i/3/, cr|, Rj, al] is 
normal with mean 

/^S/T+i = KXt+1,T+i)' f + ^f,DAxT+l,T+iy ^'f^DT^DT - Horf^f) (27) 

and variance 

=(^e+(^j{'^- Sf,DAxT+l,T+iyA]^DT^f,DAxT+l,T+l)} (28) 

Using Dz, the conditional posterior [xq, Xi, . . . , x^+i, (3^, Rf, Rg, a^, cr^ | Dt] can be 
written as 

[xo,xi, . . . ,XT+i,Of,0g,a'^,a'^ \ Dt] 

= J[xo,xi,...,XT+i,Of,eg,a^^,al,g{xlQ),D^\DT]dg{xlQ)dD^ (29) 
oc J [xo,xi, . . . ,XT+i,Of,0g,a^^,a^,g{xlQ),D;„DT]dg{xo)dD^ 
[ef,eg,ala^][xo][g{xlf^) \ xIq][D, \ ^(0:^,0), a;o, 6^9] 



(30) 



X [xi = gixl o) + r]i I g{xlQ),xo, f3g, a], a^] 
X [xt+i = gixT+^ T) + Vt I Og, crj, D^, xt] 

T-l 

Yll^t+i I Og,a'^,D^,Xt][DT \xi,.. . ,XT,df,al]dg{x\Q)dD^ (31) 



T-l 

X 

t=l 
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In the above, [xi = 5'(x^ o) + Vi I 9 {^lo), f3g,(T'^g,a'^] ~ N {g{xlo),cr^). Although the analytic 



form of (31 1 is not available, MCMC simulation from [xo,xi,. . . ,xt+i,0 f ,6 g,a'^ ,a'^,g{xlQ) ,D z\Dt] 



which is proportional to the integrand in (31 1, is possible. Ignoring g{xl q) and Dz in these MCMC 
simulations yields the desired samples from [xo,xi, . . . , xt+i,0 f,Og,a'^,a'^ \ Dt]. In the next sec- 
tion we provide the forms of the full conditionals necessary for MCMC simulations. The complete 



details are provided in Section S-9 



4. FORMS OF THE FULL CONDITIONAL DISTRIBUTIONS FOR MCMC SAMPLING 

IN THE UNIVARIATE SITUATION 

Let G,o = GzU {xl o}, D^o = {D'^,g{xlo)) , Ag^.o = 

H'g,D^o = lH'g,D^,h{xlQ)]. Clearly, [D,o \ x^_o,^g] = [^(2^*1,0) I Xo,eg][D, \ ^(a;l_o)> a^o, 
With these definitions, the forms of the full conditional distributions of the unknowns are provided 



and 
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below. 





■■] oc [/3/][£>r 1 xi,...,XT,^/,a,^] 




• ] OC [/3g][£)^o 1 a;i_o' ^g] \\[^t+i I ^g, crj, Dz, Xt] 

t=i 




■ •] oc [crj][i:>T \xi,.. . ,XT,^/,crf] 


[<^l 1 ■ 


T 

t=i 


[^l 1 ■ 


■■] oc [(r'^,][DT 1 Xi,...,xr,^/,or,^] 




T 
t=l 


Vij 1 ■ 


■] OC [rij][DT 1 Xi, . . . ,Xt,^/,o-,^]; i = l,2 




T 

■■] oc 1 ^(x*_o),xo,0g] 1 03,o-J,i:)^,Xt]; 




•] oc [g{,xlo) 1 a;o,/3g,a5[i:)^ | c/(xi^o). a^o, ki 1 ^(^^to 


[D. 1 ■ 


• ] OC 1 Og, aj, D^.xt] [D^ \ g{xl Q),xo, dg] 
t=i 


[3^0 1 ■ 

[xi 1 ■ 


■] oc [xo][D^o 1 2:0,^5] 

■] oc [xi 1 gixlo), Xo, f3g, a'^g, (r'}i][x2 \ Og^ol^D^^xi] 



X [Dt I Xi, . . .,XT,Of,al] 
oc [xt+i I Og,al,D^,Xt][xt+2 I ^g,aj,i:)2,a;t+i] 

X I xi,...,xr,6l/,a,']; t = 1,...,T-1 
oc [xt+1 I Og^al^D^^xr] 



(32) 
(33) 

(34) 
(35) 
(36) 
(37) 

(38) 
(39) 
(40) 
(41) 
(42) 

(43) 

(44) 
(45) 



Although some of the full conditionals are of standard forms permitting Gibbs sampling steps, 
others are non-standard and sampling requires Metropolis-Hastings (MH) steps in those situations. 



In Section S-9 we describe the Gibbs steps and construct proposal distributions when MH steps 
are needed. 
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4.0.1. MCMC-based single and mutiple step forecasts, filtering and retrospection Observe that 
one can readily study the posteriors [yr+i \ Dt], [xt+i \ Dt] and [xr-k \ Dt] using the readily 
available MCMC samples of {yr+i, xt+i, xx-k} after ignoring the samples corresponding to the 
rest of the unknowns. To study the posterior [xt+i \ Dt, Vt+i], we only need to augment yr+i to 
Dt to create Dt+i = (-D^, ?/t+i)'- Then our methodology can be followed exactly to generate 
samples from [xt+i \ -Dt+i]- Sample generation from [yx+k I Dx] requires a slight generalization 
of this augmentation strategy. Here we use successive augmentation, adding each simulated yx+j 
to the previous Dx+j-i to create Dx+j = {D'rp^j^^, yr+j)', j = 1,2, ... ,k. Then our MCMC 
methodology can be implemented successively to generate samples from yx+j+i and all other 
variables. This implies that at each augmentation stage we need to draw a single MCMC sample 
from [xo, xi, . . . , xx+j+i, /3/, (3g, Rj, Rg, aj, ag, cr^, | Dx+j]. Once this sample is generated, 
we can draw a single realization from yx+j+i by drawing from yx+j+i ~ N {^^yj,^^^^, a. 



r2 

VT+j + l 



where, analogous to (27 1 and (28 1, 



lI'VT+j+i - h{^X+j+l,T+j+l)'f^f + ^f,DT+ji^T+j+l,T+j+l)'Af}:)T+ji^T+j - Hor+jf^f) (46) 

and variance 

= + aj 1 1 - Sf,Dr+, (xKi+i,r+,+i)'^7iT+, */.^T+, (xt+,+i,t+j+i) } • (47) 

5 . EXTENSION TO MULTIVARIATE SITUATION 
Now we extend our nonparametric dynamic model to the case where both and Xt are multivari- 
ate. In particular, we assume that they are p-component and g-component vectors, respectively. 
Then our multivariate model is of the form 

Vt = fKt) + et, et~iVp(0,S,), (48) 
= g{xl^_-^) + rjt, 1]^^ Ng^xo,!:^), (49) 

where Xq ~ iV,(^i^^, S^J; fi^^, S^^ assumed known. In the above, /(■) = (/i(-), . . . , is 
a function with p components and g(-) = {gi{-), ■ ■ ■ ,gq{-))' is a function consisting of q compo- 
nents. We assume that /(■) is a p-variate Gaussian process with mean E[f{-)] = B'jh{-) and 
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with covariance function cov{f{zi), /(Z2)) = Cf{zi, Z2)Tif, for any g-dimensional inputs zi, z^. 
Here h{-) = (/ii(-), • • • , hmi-))' and = {f3^j, . . • ,/3pj), where, for j = 1, . . . ,p, f3jj are m- 
dimensional column vectors; clearly, m = q+2. Also, c/(zi, 22) = exp { — {zi — Z2)'Rf{zi — 22)}, 
where Rf is a diagonal matrix consisting of smoothness parameters, denoted by {rij, . . . , 
Similarly, we assume that g(-) is a Gaussian process with mean E[g(-)] = B'yh(-) and covariance 
function Cg{zi, Z2)Tlg = exp { — (21 — Z2)'Rg{zi — 22)}, the notation used being analogous to 
those used for description of the Gaussian process /(•)• 

5. 1 Multivariate data and its distribution 

The multivariate data is given by the following T xp matrix: = (Vi, 1/2, • • • ? Vt)' ■ Writing Dt 
vectorically as a Tp- vector for convenience, as D^p = (1/1,1/2, ■ ■ ■ , v'tI ^ follows that \Dtp I 
B f,llf, Rf, Eg] is a Tp-variate normal with mean vector 



E[Dtp I Bf, S/, Rf, Se] 



^ B'fhixl,) ^ 
B'fhixy 



\ B'fh{x*r^T) j 



^^Dt^ (say) 



(50) 



and covariance matrix 



V[Dtp I Bf, Sj, Rf, S^] 



Af^DT ®^f + lT®^e = ^Dr, (say). 



(51) 



Since (51 1 does not admit a Kronecker product form, the distribution of Dxp can not be written as 
matrix normal, which requires right and left covariance matrices forming Kronecker product in the 
corresponding multivariate distribution of the vector. It follows that [Vt+i = f i^T+i t+i) ~^ ^t+i \ 
Dt, Xq, Xi, . . . , xr+iBf, S/, Rf, S^] is p-variate normal with mean 



f^VT+i = B'fh{x*rp^^r^^^) + {Dt - HDTBf)'Af^^^Sf^DTi^T+i,T+i 



(52) 



and variance 



(53) 
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5.2 Distributions of g{xl q) and 

Conditional on xq, g^xl ^) is g-variate normal with mean B'^H^xI q) and covariance matrix S^. 

In contrast with the distribution of D^, Dz,nq = (fl''(^i)) ^'(^2)5 • • • , g'{Zn))' has an ng-variate 
normal distribution with mean 



E[Dz,nq I -^95^9)-Rfl 



\ B'^h{z^) j 



(say) 



(54) 



and covariance matrix 



(55) 



V[D,^nq I Bg, S3, Rg] = Ag^D. ^23 = S,5.,„, (say). 

Hence, the distribution of the n x g-dimensional matrix = {g{zi),g{z2), . . . , g{Zn))' is matrix 
normal: 

[D, I Bg, E„ Rg] ~ AC,, (if.Sg, E,) (56) 

Conditionally on (a^o, g{x\ q))? it follows that Dz is n x g-dimensional matrix-normal: 



[Dz I £?(a3t,o)> ^0, Bg, Eg, ilg, S^] ~ Un,q {Hg^D.^^9,D.,^g) 



(57) 



In (pTj) is the mean matrix, given by 



Mg,D, = HD^Bg + Sg^D^xlo^^aixlo)' - h{x\^)'B 



(58) 



and 



= Ag^D. - Sg,DAx*lfi)^9,DAxlQ)' (59) 

Here we slightly abuse notation to denote both univariate and multivariate versions of the mean 
matrix and the right covariance matrix by ^^'^ '^g,D^, respectively (see (j9j) and (111). 

5.3 Joint distribution of {xq, . . . , xt+i, Dz} 
Note that 



[xi I g{xo), xo, Bg, i:g] ~ Nq [gixl^), S^) 



(60) 
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and fort = 1, ... ,T, the conditional distribution [a^t+i = g{x*^^-^^^^)+r]^_^_-^ \ D^, Xt, Bg,llg, Rg,llri] 
is g-variate normal with mean 



and variance 



(61) 



(62) 



Since [xq] ~ Np {n^o^ '^xo) and the distribution of is given by (56) the joint distribution is 
obtained by taking products of the individual distributions. 

5.4 Prior distributions 

We assume the following prior distributions: 



For i = l,...,(g + l), 
For i = l,...,(p + l), 



[Bf I 



Rc 



oc SJ ^ 



exp 



oc 



oc lEj 



exp 



exp 



exp 

A/'m,p (^/,0, '^Bffli i^'^f) 
J^m,q {Bgfl, ^Bgfi, i'^g) 



; z^e > p - 1 
; > g - 1 
; z^/ > p - 1 
; i^g> q-l 



(63) 

(64) 
(65) 

(66) 

(67) 

(68) 

(69) 
(70) 



All the prior parameters are assumed to be known; and their choices will be discussed in the 
specific applications to be considered. 

The full conditionals are of analogous forms as those in the univariate case, provided by equa- 



tions from (32) to (45), but now the univariate distributions must be replaced with multivariate 
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distributions and multivariate distributions with matrix- variate distributions, and interestingly in 
some cases the full conditionals are not available in closed form although these were available in 
the one-dimensional version of the problem. For example, although the full conditional of /3j and 
(3g were available in the one-dimensional problem, the corresponding distributions of Bj and Bg 
are no longer available in closed forms. The problem of obtaining the closed form of the full con- 
ditional of B f can be attributed to the fact that ([5T]) can not be represented as a single Kronecker 



product. The non-availability of the closed form in the case of Bg is due to fact that the covariance 
matrices of Xt are additive. In fact, it turns out that only the full conditional of x^+i is of standard 



form. Details of our MCMC methods for the multivariate case are provided in Section S-10 



6. SIMULATION STUDY 
We consider a simulation study where we generate the data from a 4-variate model, constructed 
using the univariate nonstationary growth model of Carlin et al. (1992). To reduce computational 
burden in this multi-dimensional set up we generated 50 data points using the following model: 
fort = 1,...,50, 

xt,i = axt^i^i + (3xt-i,i/{l + xf_^ i) + 7cos(1.2(t - 1)) + rjt^i (71) 

Xt,2 = aXt-1,2 + /3Xt_i,2/(l + Xt„^2) + Vt,2 (72) 

xt,3 = a + (3xt-i,3 + r]t,3 (73) 
xt,4 = 7cos(1.2(t - 1)) + r]t,4 (74) 

yt,i = xy 20 + et,i] ^ = l,...,4, (75) 

where et^i and r]t,i are iid zero-mean normal random variables with variances 0. 1 . We set a = 0.05, 
/3 = 0.1 and 7 = 0.2, and Xq = 0. 

6. 1 Choice of prior parameters 

For the priors of B f and Bg, we set ijj = 1, B j q and Bg^ to be null matrices, and S Bj,o and ^Bg,o 
to be identity matrices. For the priors of S^, S.^, Sj, and S^, we set = p, = q, = p, and 
Ug = q. We chose = = 0.1/, and S/ = = 0.5/, where / denotes the identity matrix. 
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For the log-normal priors of the smoothness parameters we set /ij?^ = fiR^ = —0.5 and = 
cr|.^ = 1. The choices imply that the prior mean and the prior variance of each of the smoothness 
parameters are, respectively, 1 and 2 (approximately). In our simulation experiment these choices 
seemed adequate. In the case of real data analysis, in Section |7| however, we increased the prior 
variabilities, since in real data cases more uncertainty about smoothness is expected. 

6.2 MCMC implementation 

For updating the smoothness parameters we used normal random walk proposals with variances 
0.005. To set up the grid for the model-fitting purpose, we considered [—1.5, 1.5]^ to be a 
grid space for the 4-dimensional variable x. Indeed, this grid-space contains the entire true 4- 
dimensional time series. We divide [—1.5,1.5] into 100 equal sub-intervals and chose a point 
randomly from each of 100 sub-intervals, in each dimension, yielding n = 100 4-dimensional 
points corresponding to a;. As before, we chose the first component of the grid (corresponding to 
the time component) by uniformly drawing from each subinterval + 1]; i = 0, ... ,99. 



The block updating proposal for updating B f, described in Section S-10.1 worked quite well. 



but those for block updating Bg, g{xl q), D^, and those for the covariance matrices, as decribed 



in Section S-10 yielded poor acceptance rates. In order to update the covariance matrices S^, 
S^, Eg and S/, we considered the following strategy: we re-wrote the matrices in the form CC, 
where C is a lower triangular matrix, and used normal random walk with variance 0.005, to update 
the non-zero elements in a single block. This improved the acceptance rates. For Bg, g{xl q) and 
Dz, the strategy of block updating using random walk failed. As a remedy we considered the 
transformation-based MCMC (TMCMC), recently developed by Dutta & Bhattacharya (2011); in 
particular, we used the additive transformation, which requires much less number of move types 
and hence computationally less expensive compared to other, non-additive move types. Briefly, for 
each of the blocks, we generated ^ ~ N(0, 0.05)/(,^ > 0). Then, for each parameter in the block, 
we either added or subtracted ^ with equal probability. In other words, we used the same ^ to update 
all the parameters in a block, unlike the block random walk proposal. This considerably improved 
the acceptance rates. For theoretical and implementation details, see Dutta & Bhattacharya (201 1). 
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We discarded the first 10,000 iterations of the MCMC run as bum-in and stored the subsequent 
50,000 iterations for inference. Convergence is assessed by informal diagnostics, as before. It took 
an ordinary laptop about 24 hours to implement this multivariate experiment. 

6.3 Results of model-fitting 

Figures [T] and |2] show that our Gaussian process-based nonparametric model performs well in spite 
of the multidimensional situation — the true values are well-captured by the posterior distributions, 
and the true forecast values are also well-supported by the corresponding forecast distributions of 
x^i and y^i- Moreover, each component of the true 4-variate time series fall entirely within the 
95% highest posterior densities of the corresponding component of the 4-variate posterior time 
series. 

We now apply our ideas based on Gaussian processes to a real data set. 

7 . APPLICATION TO A REAL DATA SET 
Assuming a parametric, dynamic, linear model set up, Shumway & Stoffer (1982) and Carlin 
et al. (1992) analysed a data set consisting of estimated total physician expenditures by year 
iyut = 1, . . . , 25) as measured by the Social Security Administration. The unobserved true an- 
nual physician expenditures are denoted by Xt;t = 1, . . . , 25. Shumway & Stoffer (1982) used a 
maximum likelihood approach based on the EM algorithm, while Carlin et al. (1992) considered a 
Gibbs sampling based Bayesian approach. 

We apply our Bayesian nonparametric Gaussian process based model and methodology on the 
same data set and check if the results of the former authors who analysed this data based on the 
linearity assumption, agree with those obtained by our far more general analysis. 

7. 1 Choice of prior parameters and MCMC implementation 

For the choice of the parameters of the priors of a'j, a^, and cr^, we first note that the mean is of 
the form 7/(« — 2) and the variance is of the form 27^/{(a — 2)^(a — 4)}. Thus, if we set 7/(0 — 
2) = a, then the variance becomes 2a^/ {a — A). Here we set a = 0.5, 0.5, 0.1, 0.1, respectively, 
for (Tj, al, cr^. We set a = 4.01 so that the prior variance is of the form 200a^; we set 
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Figure 1 : Simulation study with data generated from the 4-variate variate modification of the 
model of CPS: Posterior distributions of Xq, cct+i (one-step forecasted x), and Ut+i (one-step 
forecasted y). The solid line stands for the true values of the respective parameters. 
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Figure 2: Simulation study with data generated from the 4-variate modification of the model 
of CPS: 95% highest posterior density credible intervals of the time series xij, . . . , xtj', j — 
1, 2, 3, 4. The solid line stands for the true time series. 
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a = 0.5, 0.5, 100, 100, respectively, for aj, ag, al and cr^. These imply that the prior expectations 
of the variances are 0.5, 0.5, 100, and 100, respectively, along with the aforementioned variances. 
The choices of high values of the prior expectations of and are motivated by Carlin et al. 
(1992), while the small variabilities of a'j and cr^ reflects the belief that the true functional forms 
are perhaps not very different from linearity, the belief being motivated by the assumption of 
hnearity by both Shumway & Stoffer (1982) and Carlin et al. (1992). Also motivated by the latter 
we chose xq ~ A^(2500, 100^) to be the prior distribution of xq. We set the priors of /3j and 
I3g to be trivariate normal distributions with zero mean and the identity matrix as the covariance 
matrix. For the prior parameters of the smoothness parameters we set cr^ = 100 and /i = — 50 
in the log-normal prior distributions, so that the means are 1 and the variances are exp(lOO) — 1. 



As mentioned in Section 6.1 , here we set high variance for the smoothness parameters to account 
for much higher degree of uncertainty about smoothness in this real data situation compared to the 
simulation experiment. 

To set up the grid Gz we noted that the MLEs of the time series obtained by Shumway & 
Stoffer (1982) using linear state space model are contained in [2000, 30000] . We divide this interval 
into 200 sub-intervals of equal length, and select a value randomly from each such sub-interval, 
obtaining values of the second component of the two-dimensional grid G^- For the first component, 
we generate a number uniformly from each of the 200 sub-intervals [i, z + 1]; i = 0, . . . , 199. 

We discard the first 10,000 MCMC iterations as burn-in and store the next 50,000 iterations for 
inference. We used the normal random walk proposal with variance 0.05 for updating o"/, cr^, and 
cr^ and the normal random walk proposal with variance 10 for updating {xq, Xi, . . . , xt}- These 
choices of the variances are based on pilot runs of our MCMC algorithm. As before, informal 
diagnostics indicated good convergence properties of our MCMC algorithm. It took around 17 
hours to implement this application in an ordinary laptop machine. 

7.2 Results of model-fitting 

Figures[3||4]and[5]show the posterior distributions of the unknowns. Our posterior time series of Xt 
has relatively narrow 95% highest posterior density credible intervals, vindicating that the linearity 
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beta_f_1 beta_f_2 beta_f_3 




-4 -2 2 4 -4 -2 2 4 0.8 0.9 1.0 1.1 1.2 




Figure 3: Real data analysis: Posterior densities of /3o,/, /3ij, /32,/, (3o,g, /3i,g, /32,g, o'f, ag, and a^. 
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Figure 4: Real data analysis: Posterior densities of cr^, rij, r2j, ri^g, r2,g, xq, xt+i (one- 
forecasted x), and ht+i (one-step forecasted y). 
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Figure 5: Real data analysis: 95% highest posterior density credible intervals of the time series 
xi,. . . ,xt. ThesolidlinestandsfortheMLEtimeseriesobtainedbyShumway and Stoffer (1982). 
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assumption is not unreasonable as claimed by Shumway & Stoffer (1982) and Carlin et al. (1992). 
Indeed, for such linear relationships, it is well-known that the Gaussian process priors that we 
adopted are expected to perform very well. 

However, perhaps not all is well with the aforementioned linearity assumption of Shumway 
& Stoffer (1982) and Carlin et al. (1992). Note that although the MLE time series obtained by 
Shumway & Stoffer (1982) fall mostly within our Bayesian 95% highest posterior density credible 
intervals of Xt, five observations towards the end of the time series fall outside. These observations 
correspond to the years 1968, 1970, 1971, 1972, and 1973. The values of yt in these years are 
11, 099, 14, 306, 15, 835, 16, 916, and 18, 200. We suspect that Unearity breaks down towards the 
end of the time series, which the models of Shumway & Stoffer (1982) and Carlin et al. (1992) 
fail to take account of. On the other hand, without any change point analysis, our flexible non- 
parametric model, based on Gaussian processes, is able to accommodate changes in the regression 
structures. 

8. CONCLUSIONS AND FUTURE WORK 
In this article, using Gaussian process priors and the "look-up table" idea of Bhattacharya (2007) 
we proposed a novel methodology for Bayesian inference in nonparametric state space models, 
in both univariate and multivariate cases. The Gaussian process priors on the unknown func- 
tional forms of the observational and the evolutionary equations allow for very flexible modeling 
of time- varying random functions, where even the functional forms may change over time, with- 
out requiring any change point analysis. We have vindicated the effectiveness of our model and 
methodology with a challenging multidimensional simulation experiment and a real data analysis 
which provided interesting insight into nonlinearity of the underlying time series towards its end. 

For our current purpose, we have assumed iid Gaussian noises and rjt; however, it is straight- 
forward to generalize these to other parametric distributions (thick-tailed or otherwise). It may, 
however, be quite interesting to consider nonparametric error distributions, for example, mixtures 
of Dirichlet processes; such a work in the linear cases has been undertaken by Caron, Davy, Doucet, 
Duflos & Vanheeghe (2007). We shall also consider matrix-variate extensions to our current work, 
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in addition to nonparametric error distributions. Computations for these extensions may be burden- 
some in the extreme, but we hope that the novel TMCMC theory proposed by Dutta & Bhattacharya 
(201 1) will be very useful in this regard. 

Supplement to "Bayesian Inference in Nonparametric Dynamic 

State-Space Models" 

Throughout, we refer to our main paper Ghosh, Mukhopadhyay, Roy & Bhattacharya (201 la) 
as GMRB. 

S-9. DETAILS OF MCMC SAMPLING IN THE UNIVARIATE SITUATION 
S-9.1 Updating /3 J using Gibbs step 
The full conditional of is m-variate normal with mean 



X 



[h'j,^ {ajAj^nr + oliy' Dt + 5^^;,o/3/,o} • (76) 



and variance 



(77) 



S-9.2 Updating /3g using Gibbs step 

The conditional distribution of (3g is m-variate normal with mean 

^ ^2 

^ ^ - Sg^DXx*t+i,t)'A-},D,) - H'j,A-^^^Sg,DXx*t+i,t)) 1 ^^g^ 

t=l '^9,V,t I 
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and variance 



T 



t=i 



a 



(79) 



In (78) and (79), 



(80) 



S-9.3 Updating a'j and cr| using MH steps 

The full conditionals of a'j and (j| are not available in closed forms and MH steps are necessary 
here. For proposal distributions we first construct a new model by setting = a'j and cr^ = cr^. 
For this model the full conditional distributions of a? and af. are inverse Gamma distributions, 



given by 



2r - 



/T+af+2 



exp 



2a 



(81) 



and 



<lol{<yl) oc [afj 

T 



a„+4+n+T 



exp 



a 



9,t 



(82) 



In (82) 



(83) 
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In (82) the terai ((t|) exp | — ^(a^i — 5'(a^i,o))^| ^^^o taken into account since, given cr^ = 
cr|, [xi I g{xl q), xq] ~ N{g{xl q), cr|). It is useful to remark here that unless aj ~ and ag ~ cr.^ 
these proposal mechanisms may not be efficient. We shall discuss other proposal distributions in 
the context of applications. 

S-9.4 Updating cr^ and cx^ using MH steps 

As before, the full conditionals of cr^ and cr^ are not available in closed forms. For MH steps, 
we construct proposal distributions obtained by setting aj = al and (j| = cr^. The proposal 
distributions are given by 



2 , 



exp 



(84) 



exp 



2cr2 

ri 



a 



g,t 



(85) 



In (85 1 the term (a^) ^"+^^/%xp j-^^ (i:>,o - i^D^o/^J' ^g.L, (-^-o - if d.o/^J } occurs since, 
given al = a'^, [D^q \ (3g, cr^, Rg] ~ A/'„+i [Ho^oPg, cr'^Ag^D^g) . Again, these proposals need not 
be efficient unless af ^ and ag ^ cr^. In the context of specific applications we shall discuss 
other proposal distributions. 

S-9.5 Updating Rf and Rg using MH steps 

For i = 1,2, the full conditionals of the smoothness parameters Vij and r^ g are not available 
in closed forms, and we suggest MH steps with normal random walk proposals with adequately 
optimized variances. 
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S-9.6 Updating qIxI q) using Gibbs step 

The full conditional of g{xl q) is univariate normal with mean and variance given, respectively, by 

-1 



1 1 + Sg^oMfiy^g^D^SgM^lo 



E[g{xl 



V 9 



X 



(86) 



and 



V^[^7Ko) I ■■■] = {-2+ 



O" (J 



(87) 



In (86) 



D: = D^- Hn.f3g + s,^n. {xl,)h{xl,y fB^ (88) 
S-9.7 Updating Dz using Gibbs step 

The full conditional distribution of is ri-variate normal with mean 

prn I 1 J ^g.k , 4-1 / ^g,gz(3^f+i,f)^g,g.(3^t+i,t) \ ^-i 

^ i-^- I ■ ■ ■ J = i + ^9,D. 1^ -2 ^9,D. 

I ^9 \t=l ^9,V,t J 



—^2 + ^9,D. 2^ 

(89) 



^" t=l ^9,V,t 



and variance 



-1 

S9.DAx:,-,,)S„DAXt 



fio, I = ^ E ^r" ^;,k (90) 



^9 ' \t=l ^9,V,t 



In (|89|) and (|90|), and 'Sg^o, are given by (9) and (10), respectively, of GMRB. 
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S-9.8 Updating xq using MH step 

Let f3f = {(3'qj, (5ij)' and (3g = {(3'^^, (5i^g)' , where (3qj and (3^^ are two-component vectors. 
Assuming the prior of Xq to be normal with mean ^^^^ and variance a^^^, and setting cr| = 0, the full 
conditional distribution of Xi is univariate normal, with mean and variance given, respectively, by 

^.^r^+g¥V'i¥+ '"-'°'>'^'4 . (91) 



1 , PI 



In (91 1, for any t, fc^ = (1, t)' . We use g^ol^^o) = (a;o : ■Co? 0o) the proposal distribution for 
updating xq using MH step. Observe that, under (t| = 0, g{x\ q) = h{x\ oYf^g with probability 
one; hence [xi \ g{xl Q), Xq, f3g, cr^] ~ N{h{x\ Qy (3g, a^), which has been taken into account while 
constructing the above proposal distribution. Note that this proposal will only be efficient when the 
■) is close to linear. As a result, for non- linear applications, we shall often use other proposal 
mechanisms, such as the normal random walk. 

S-9.9 Updating {xi, . . . , xt} using MH steps 

We construct proposal distributions for simulating {xi, . . . , xt} based on linear observational and 
evolutionary equations, setting a'j = a"^ = 0. Thus, for t = 0, . . . , T — 1, the proposal distributions 
of Xf+i are of the form qxt+i (a^t+i) = N {xt+i : 6+i, (t>t+i), that is, a normal distribution with mean 
^t+i and variance where the latter quantities are given by 

gXt + {xt+2 - K+2(^Q,a)Pi,a , iVt+i - K+il^oj)PiJ 
t,t+l - \ h -j^ ] \ \ 

(93) 



These proposal mechanisms will be efficient only if both /(■, ■) and g(-, ■) are close to linear. 
Hence, for non-linear situations, we shall consider other proposal distributions. 
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S-9.10 Updating xt+i using Gibbs step 

The full conditional distribution of xt+i is normal with mean and variance given, respectively, by 



E [xT+i !•••] = /i(xKi,t)'/3, + s,,D. (xT+i,T)'^;,k (^^ - HD.f3,) (95) 

and variance 

V [xt+1 \ ■■■] = al + a^g{l- Sg,D, (a:^r+i,T)'^;i, Sg,D, (^^t+i.t) } (96) 

S- 10. DETAILS OF MCMC SAMPLING IN THE MULIVARIATE SITUATION 
To obtain good proposal distributions, we use our old strategy of ignoring the error terms and 
r}^. Below we provide details of the proposal distributions used in each case. For our purpose, we 
abuse notation sUghtly as in Section 4 of GMRB, that is, we denote by G^o, D^o, Ag^^o and Hg^o^o 
the multivariate analogues of the quantities corresponding to the univariate situation described in 
Section 4 of GMRB. In other words, let Gzo = G U {xl ^}, D^o = (£>'^,sr(cci q)) ' ^p,zo = 

^9,0^^1,0)' 1 

S- 10. 1 Proposal distribution for updating B f 

Assuming Eg = S/ in our multivariate dynamic state-space model, the full conditional of Bf is 
m X p-variate matrix-normal: 

B/~A/'^,p(/U5^,Sb^,S^), (97) 
[h'^^ (A^,^, + It)-' Dt + r'^^loBf^o) , (98) 



where 



X 



and 



YlB, = [h'^, {Af^D, + It)-' Ho, + r'^B],o)~' ■ (99) 
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S-10.2 Proposal distribution for 

Assuming = S^, the conditional distribution of Bg is m x g-variate matrix: 



Bg ~ N'rn,q (yt^Bg^ S^^, S^j , 



(100) 



where 



T 



t=i 



a 



t=l 



and 



T 



t=i 



a 



9,t 



(102) 



In ( 101 ) and ( 102 1, a^^ is given by ( 83 1, with obvious notational change from the univariate Xt to 



the multivariate Xf. The corresponding changes from x* to x* are also self-explanatory. 



S-10.3 Proposal distributions for updating S / and 

Setting = and = S^,, we obtain the following inverse Wishart proposal distributions of 
and S^: 



oc iS/fV ^ ; exp 



+ {Dt - Ho^Bf)' {Af^Dr + ItV' {Dt - HD^Bf)) }] (103) 
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and 



_ ' Vg + q + 2+m + n+T 

gs,(S,)cx|S I V 



exp 



t=l 9,t 



(104) 



S-10.4 Proposal distributions for updating S^, S^, Rf and Rg 

As in the univariate case, here we set Sj = and = S^. Then the proposal distributions of 
and are given by the following: 



X exp 



(105) 



exp 



E - B'gKxl^,,) - {D, - HnB,)'A-},^Sg,nA^;^i,t)} 



t=l 



X 



{xt+i - B'yh{xl^^ ^) - [D, - HDBg)'Ag}j^Sg^DA<+i,t 



(106) 



As in the case of the corresponding univariate proposals, in ( 106 1 the factor 

1 



(2) exp 



-trT.-\D,Q-Hn..,.B^' A-^ 



D,0-Dg) ^g^D^o y^^O - Ho^oBg) 



occurs because under Eg = S,,, [D^o \ Bg, Eg, Rg] ~ Mn,q {Ho.oBg, Ag^D^o, S^). 

The full conditionals of the smoothness parameters vn Rf and Rg are not available in closed 
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forms, and as before we suggest Metropolis-Hastings steps with normal random walk proposals 
with adequately optimized variances. 



S- 10.5 Proposal distribution for updating g{xl Q) 

Assuming = Eg, the full conditional of gixl ^) is ^-variate normal with mean and variance 
given, respectively, by 

K = E[g{xl,) I • • • ] = {2 + s,,nA^l,yj:-},^s,,nA^lo)}-'^9 

X {x^ + B'^h{xl,) + D*'1:-},^s,,dA^I,)} 



(107) 



and 



(108) 



In the above, 'Sg,D^ is given by (59) of GMRB, and £>* is given by 

Dl^D,- HuB, + s,,uA^lo)h{^lo)'B, (109) 
We consider qg{xi Q){g{x\^Q)) = Nq {g{x\^Q) : /Lt*, S*) as the proposal distribution for updating 

S-10.6 Proposal distribution for updating 

The full conditional distribution of in our dynamic model after setting = S^, is matrix- 
normal: 

(110) 

where 



(111) 



t=l ^9,t 
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and 

St, = {s-L . + . (t '""-'"'^"g;"-'"'^"'' ) ^-kj ' (112) 
In the above, /^^^d^ ^a,Dz ^^e given by (58) and (59) of GMRB, respectively. The distribution 



(110) will be used as proposal distribution to update Z?^ using Metropolis-Hastings step. 



S-10.7 MH proposal for updating Xq 

For j = l,...,p, let (3jj = /,2)'- Likewise, for j = 1, . . . , q, let (3jg = {(3'^^-^, fi'j^g^2)' ■ 

Here /3j / 1 and l^j^g^i are two-dimensional and /3jj,2 ^^'^ f^j,g,2 g-dimensional vectors. Also let 
= • • • ' /5p,/,i)' ^/ = (/5i,/,2' • • • ' /3p,/,2)'- Similarly, = (/3i,g,i, • . • , and 

= if3i g 2, • • • , /3g,c,,2)'- Thus, a/ and are p x 2 and g x 2-dimensional matrices respectively, 
while bf and bg are, respectively, p x q and g x g dimensional matrices. Then, B'jh(x*) = 
ajki + bfX and B'^h^x*) = a^fci + b^x, for any g-dimensional x. With these representations, 
and setting Eg = 0, the full conditional distribution of xq is g-variate normal with mean and 
variance given, respectively, by 



E[xo I ■ ■ ■] = {i:-^ + b'^^,-\}-' {S-V,„ + b;S,-^ {x, - agk,)} (113) 

v[xo\---] = {i:7j + b'^i:,-%}-\ (114) 

S-10.8 MH proposals for {xi, . . . , x^} 

Ignoring the errors of the functions / and g, that is, setting S/ = Eg = 0, it turns out that the 
proposal distribution of Xt+i, for t = 0, . . . , T — 1, can be taken as a g-variate normal distribution 
with mean and variance given, respectively, by 

E[x,^, I ■ • •] = (s,-i + 6;s-i6; + b'^i:;%)-' 

X {S^^ {agkt+i + bgXt) + bgS^^ {xt+2 - agkt+2) + - a/fct+i)} 

(115) 

v[x,^, I • ■ •] = (s^-i + b;s-ib; + b'fYi-^'bf)-' . (116) 
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S-10.9 Gibbs step for updating xt+i 

The full conditional distribution of Xt+i is g-variate normal with mean 
and variance 

The caveat for using these proposal distributions is that if the underlying assumptions = E^, 
T,g — and Eg = do not hold, then the proposal mechanisms will turn out to be much less 
effective, and more so compared to the univariate cases, due to the curse of dimensionality. So, we 
shall also consider other appproaches to these updating procedures, to be discussed in the context 
of the specific applications. 
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